Multilingual corpora: models, methods, uses

Similar resources

Multilingual Corpora for Cooperation

MLCC was a corpus acquisition project funded by the EC Telematics program. The aim was to collect a set of texts representing a substantial improvement in the range, quantity and quality of corpus material available. Two sub-corpora have been defined to help meet the need for multilingual data: a comparable set of texts in six languages and a parallel set of data in nine languages. The c...

Pseudo-Aligned Multilingual Corpora

In machine translation, document alignment refers to finding correspondences between documents that are exact translations of each other. We define pseudo-alignment as the task of finding topical, as opposed to exact, correspondences between documents in different languages. We apply semi-supervised methods to pseudo-align multilingual corpora. Specifically, we construct a topic-based graph for ea...

Automated Alignment in Multilingual Corpora

This paper reports experiences in computing alignments at the paragraph and sentence level within the TRANSLEARN project, part of the European Union's "LRE" programme of research and development in language engineering. About 98% of the sentences in pairs of corpora in different languages have been aligned correctly by a method that uses dynamic programming on the number of characters per sentence. This paralle...

Multilingual Aspects of Monolingual Corpora

If one were to poll computational linguists on the most important trend in linguistics in the last decade, the majority would very probably answer that it was the massive use of large natural language corpora in many linguistic fields. The concept of collecting large amounts of written or spoken natural language data has become extremely important...

Building Strong Multilingual Aligned Corpora

Recent advances have allowed algorithms that learn from aligned natural language texts to exploit aligned sentences in more than two languages. We investigate ways of combining $\binom{N}{2}$ bilingual aligned corpora to create a multilingual aligned corpus across N languages. As a result of the combination of several corpora, our algorithms output a multilingual corpus, with each aligned tup...

Journal

Journal title: Tradterm

Year: 2004

ISSN: 2317-9511, 0104-639X

DOI: 10.11606/issn.2317-9511.tradterm.2004.47044